The Smart/Empire TIPSTER IR System

Authors

  • Chris Buckley
  • Janet A. Walz
  • Claire Cardie
  • Scott Mardis
  • Mandar Mitra
  • David R. Pierce
  • Kiri Wagstaff
Abstract

We attack each task through a combination of statistical and linguistic approaches. The proposed statistical approaches extend existing methods in IR by performing statistical computations within the context of another query or document. The proposed linguistic approaches build on existing work in information extraction and rely on a new technique for trainable partial parsing. In short, our integrated approach uses both statistical and linguistic sources to identify selected relationships among important terms in a query or text. The relationships are encoded as TIPSTER annotations [7]. We then use the extracted relationships: (1) to discard or reorder retrieved texts (for high-precision text retrieval); (2) to locate redundant information (for near-duplicate document detection); and (3) to generate coherent synopses (for context-dependent text summarization).


Similar resources

The Cornell TIPSTER Phase III Project

The overall objective of the Cornell University TIPSTER Project was to improve end-user efficiency in information retrieval systems by reducing the amount of text that the user must process [1]. The project focuses on high precision IR, near-duplicate detection and context-dependent summarization. The two main foundations of the research are the latest version of the Smart system for informatio...

Improving End-User Efficiency Using the Smart/Empire IR System

We attack each task through a combination of statistical and linguistic approaches. The proposed statistical approaches extend existing methods in IR by performing computations within the context of another query or document. The proposed linguistic approaches build on existing work in information extraction and rely on a new technique for trainable partial parsing. In short, our integrated app...

Automatic Text Summarization in TIPSTER

Automatic Text Summarization was added as a major research thrust of the TIPSTER program during TIPSTER Phase III, 1996-1998. It is a natural extension of the previously supported research efforts in Information Extraction (IE) and Information Retrieval (IR). There is considerable interest in automatically producing summaries due, in large part, to the growth of the Internet and the World Wide ...

Enhancing Detection through Linguistic Indexing and Topic Expansion

Natural language processing techniques may hold a tremendous potential for overcoming the inadequacies of purely quantitative methods of text information retrieval. Under the Tipster contracts in phases I through III, GE group has set out to explore this potential through development and evaluation of new text processing techniques. This work resulted in some significant advances and in a bette...

A Simple Probabilistic Approach to Classification And Routing

Several classification and routing methods were implemented and compared. The experiments used FBIS documents from four categories, and the measures used were the tf.idf and Cosine similarity measures, and a maximum likelihood estimate based on assuming a Multinomial Distribution for the various topics (populations). In addition, the SMART program was run with 'lnc.ltc' weighting and compared t...


Publication date: 1998